62 research outputs found
Deep Virtual-to-Real Distillation for Pedestrian Crossing Prediction
Pedestrian crossing is one of the most typical behaviors that conflict with the
natural driving behavior of vehicles. Consequently, pedestrian crossing
prediction is one of the primary tasks that influence vehicle planning for
safe driving. However, current methods that rely on data collected in real
driving scenes cannot depict and cover all kinds of scene conditions in the
real traffic world. To this end, we formulate a deep virtual-to-real
distillation framework that introduces synthetic data, which can be generated
conveniently, and borrows the abundant information on pedestrian movement in
synthetic videos for pedestrian crossing prediction in real data with a simple
and lightweight implementation. In order to verify this
framework, we construct a benchmark of 4667 virtual videos containing about 745k
frames (called Virtual-PedCross-4667), and evaluate the proposed method on two
challenging datasets collected in real driving situations, i.e., the JAAD and PIE
datasets. State-of-the-art performance of this framework is demonstrated by
exhaustive experimental analysis. The dataset and code can be downloaded from
http://www.lotvs.net/code_data/.
Comment: Accepted by ITSC 202
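The abstract does not specify the distillation objective, so the following is only a minimal sketch of a generic knowledge-distillation loss, assuming a teacher trained on synthetic videos and a student trained on real data. The function names, the temperature, and the weighting factor `alpha` are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss (assumed form, not taken from the paper).

    Combines the KL divergence between softened teacher and student
    distributions with the cross-entropy of the student on the true label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    cross_entropy = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft-target term on a comparable
    # scale to the hard-label term.
    return alpha * temperature ** 2 * kl + (1 - alpha) * cross_entropy
```

For a binary crossing / not-crossing task, `student_logits` and `teacher_logits` would each hold two values per pedestrian sample, and `label` would be the index of the true class.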
Heterogeneous Trajectory Forecasting via Risk and Scene Graph Learning
Heterogeneous trajectory forecasting is critical for intelligent
transportation systems, but it is challenging because of the difficulty of
modeling the complex interaction relations among heterogeneous road agents
as well as their agent-environment constraints. In this work, we propose a risk
and scene graph learning method for trajectory forecasting of heterogeneous
road agents, which consists of a Heterogeneous Risk Graph (HRG) and a
Hierarchical Scene Graph (HSG) from the aspects of agent category and their
movable semantic regions. The HRG groups each kind of road agent and calculates
their interaction adjacency matrix based on an effective collision risk metric.
The HSG of the driving scene is modeled by inferring the relationship between road
agents and the road semantic layout aligned by the road scene grammar. Based on
this formulation, we can obtain effective trajectory forecasting in driving
situations, and superior performance to other state-of-the-art approaches is
demonstrated by exhaustive experiments on the nuScenes, ApolloScape, and
Argoverse datasets.
Comment: Submitted to IEEE Transactions on Intelligent Transportation Systems, 202
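The abstract does not define the collision risk metric, so the sketch below only illustrates the general idea of grouping agents by category and filling an adjacency matrix with pairwise risk values. The risk used here (an exponential of the distance at closest approach under constant velocity) is a stand-in chosen for illustration, and the agent dictionary layout, `scale`, and `horizon` are hypothetical.

```python
import math
from collections import defaultdict


def closest_approach_distance(a, b, horizon=3.0):
    """Smallest distance between two agents within `horizon` seconds,
    assuming both keep their current velocity."""
    dpx = b["pos"][0] - a["pos"][0]
    dpy = b["pos"][1] - a["pos"][1]
    dvx = b["vel"][0] - a["vel"][0]
    dvy = b["vel"][1] - a["vel"][1]
    speed_sq = dvx * dvx + dvy * dvy
    if speed_sq == 0:
        t = 0.0  # no relative motion: the distance never changes
    else:
        t = min(max(-(dpx * dvx + dpy * dvy) / speed_sq, 0.0), horizon)
    return math.hypot(dpx + dvx * t, dpy + dvy * t)


def risk_adjacency(agents, scale=5.0, horizon=3.0):
    """Return one symmetric risk adjacency matrix per agent category."""
    groups = defaultdict(list)
    for agent in agents:
        groups[agent["category"]].append(agent)

    matrices = {}
    for category, members in groups.items():
        n = len(members)
        matrix = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                d = closest_approach_distance(members[i], members[j], horizon)
                # Stand-in risk: 1.0 for a predicted contact, decaying with distance.
                matrix[i][j] = matrix[j][i] = math.exp(-d / scale)
        matrices[category] = matrix
    return matrices
```

Each resulting matrix could serve as the weighted adjacency of a graph over the agents of one category; how the paper combines categories is not stated in the abstract.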
Parallel Multistage Wide Neural Network
Deep learning networks have achieved great success in many areas, such as large-scale image processing. They usually need large computing resources and time, and they process easy and hard samples inefficiently in the same way. Another undesirable problem is that the network generally needs to be retrained to learn new incoming data. Efforts have been made to reduce the computing resources and realize incremental learning by adjusting architectures, such as scalable effort classifiers, multi-grained cascade forest (gc forest), conditional deep learning (CDL), tree CNN, decision tree structure with knowledge transfer (ERDK), and forest of decision trees with RBF networks and knowledge transfer (FDRK). In this paper, a parallel multistage wide neural network (PMWNN) is presented. It is composed of multiple stages that classify different parts of the data. First, a wide radial basis function (WRBF) network is designed to learn features efficiently in the wide direction. It can work on both vector and image instances, and it can be trained quickly in one epoch using subsampling and least squares (LS). Second, successive stages of WRBF networks are combined to make up the PMWNN. Each stage focuses on the misclassified samples of the previous stage. The network can stop growing at an early stage, and a stage can be added incrementally when new training data is acquired. Finally, the stages of the PMWNN can be tested in parallel, thus speeding up the testing process. To sum up, the proposed PMWNN has the advantages of (1) fast training, (2) optimized computing resources, (3) incremental learning, and (4) parallel testing with stages.
The experimental results on MNIST, a number of large hyperspectral remote sensing datasets, CVL single digits, SVHN, and audio signal datasets show that the WRBF and PMWNN achieve competitive accuracy compared with learning models such as stacked auto encoders, deep belief nets, SVM, MLP, LeNet-5, RBF network, the recently proposed CDL, broad learning, gc forest, etc. In fact, the PMWNN often has the best classification performance
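The staging idea described above, where each stage is trained on the samples the previous stage misclassified, can be sketched as follows. This is an illustration under stated assumptions: a nearest-centroid classifier stands in for the WRBF network, and the function names, the `max_stages` limit, and the stopping rule are hypothetical rather than taken from the paper.

```python
def fit_centroids(samples):
    """Stand-in stage learner: the per-class mean of the feature vectors."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label of the nearest class centroid."""
    def sq_dist(center):
        return sum((f - c) ** 2 for f, c in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def train_stages(samples, max_stages=5):
    """Train successive stages, each on the previous stage's mistakes."""
    stages, stage_sizes = [], []
    current = list(samples)
    while current and len(stages) < max_stages:
        model = fit_centroids(current)
        stages.append(model)
        stage_sizes.append(len(current))
        missed = [(x, y) for x, y in current if predict(model, x) != y]
        if len(missed) == len(current):
            break  # no progress: stop growing
        current = missed
    return stages, stage_sizes
```

Growth stops once a stage classifies all of its training samples correctly, makes no progress, or the stage limit is reached. How the stage outputs are combined at test time is not described in the abstract and is left out of this sketch.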
- …